2018-04-16

You will learn in this session

  • how to handle temporal and spatial autocorrelation
  • that GLMMs are not very difficult if you already know GLMs and LMMs
  • that random effects can also have non-Gaussian distributions
  • that there are even more general methods than GLMMs out there: HGLM and DHGLM
  • how to handle heteroscedasticity

Temporal autocorrelation

Temporal autocorrelation in discrete (equally spaced) time steps

The AirPassengers data

AirPassengers
##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1949 112 118 132 129 121 135 148 148 136 119 104 118
## 1950 115 126 141 135 125 149 170 170 158 133 114 140
## 1951 145 150 178 163 172 178 199 199 184 162 146 166
## 1952 171 180 193 181 183 218 230 242 209 191 172 194
## 1953 196 196 236 235 229 243 264 272 237 211 180 201
## 1954 204 188 235 227 234 264 302 293 259 229 203 229
## 1955 242 233 267 269 270 315 364 347 312 274 237 278
## 1956 284 277 317 313 318 374 413 405 355 306 271 306
## 1957 315 301 356 348 355 422 465 467 404 347 305 336
## 1958 340 318 362 348 363 435 491 505 404 359 310 337
## 1959 360 342 406 396 420 472 548 559 463 407 362 405
## 1960 417 391 419 461 472 535 622 606 508 461 390 432

The AirPassengers data

plot(AirPassengers)

Reformatting the dataset for the fit

air <- data.frame(passengers = as.numeric(AirPassengers),
                  year = rep(1949:1960, each = 12),
                  month = factor(rep(1:12, 12)))
air
##     passengers year month
## 1          112 1949     1
## 2          118 1949     2
## 3          132 1949     3
## 4          129 1949     4
## 5          121 1949     5
## 6          135 1949     6
## 7          148 1949     7
## 8          148 1949     8
## 9          136 1949     9
## 10         119 1949    10
## 11         104 1949    11
## 12         118 1949    12
## 13         115 1950     1
## 14         126 1950     2
## 15         141 1950     3
## 16         135 1950     4
## 17         125 1950     5
## 18         149 1950     6
## 19         170 1950     7
## 20         170 1950     8
## 21         158 1950     9
## 22         133 1950    10
## 23         114 1950    11
## 24         140 1950    12
## 25         145 1951     1
## 26         150 1951     2
## 27         178 1951     3
## 28         163 1951     4
## 29         172 1951     5
## 30         178 1951     6
## 31         199 1951     7
## 32         199 1951     8
## 33         184 1951     9
## 34         162 1951    10
## 35         146 1951    11
## 36         166 1951    12
## 37         171 1952     1
## 38         180 1952     2
## 39         193 1952     3
## 40         181 1952     4
## 41         183 1952     5
## 42         218 1952     6
## 43         230 1952     7
## 44         242 1952     8
## 45         209 1952     9
## 46         191 1952    10
## 47         172 1952    11
## 48         194 1952    12
## 49         196 1953     1
## 50         196 1953     2
## 51         236 1953     3
## 52         235 1953     4
## 53         229 1953     5
## 54         243 1953     6
## 55         264 1953     7
## 56         272 1953     8
## 57         237 1953     9
## 58         211 1953    10
## 59         180 1953    11
## 60         201 1953    12
## 61         204 1954     1
## 62         188 1954     2
## 63         235 1954     3
## 64         227 1954     4
## 65         234 1954     5
## 66         264 1954     6
## 67         302 1954     7
## 68         293 1954     8
## 69         259 1954     9
## 70         229 1954    10
## 71         203 1954    11
## 72         229 1954    12
## 73         242 1955     1
## 74         233 1955     2
## 75         267 1955     3
## 76         269 1955     4
## 77         270 1955     5
## 78         315 1955     6
## 79         364 1955     7
## 80         347 1955     8
## 81         312 1955     9
## 82         274 1955    10
## 83         237 1955    11
## 84         278 1955    12
## 85         284 1956     1
## 86         277 1956     2
## 87         317 1956     3
## 88         313 1956     4
## 89         318 1956     5
## 90         374 1956     6
## 91         413 1956     7
## 92         405 1956     8
## 93         355 1956     9
## 94         306 1956    10
## 95         271 1956    11
## 96         306 1956    12
## 97         315 1957     1
## 98         301 1957     2
## 99         356 1957     3
## 100        348 1957     4
## 101        355 1957     5
## 102        422 1957     6
## 103        465 1957     7
## 104        467 1957     8
## 105        404 1957     9
## 106        347 1957    10
## 107        305 1957    11
## 108        336 1957    12
## 109        340 1958     1
## 110        318 1958     2
## 111        362 1958     3
## 112        348 1958     4
## 113        363 1958     5
## 114        435 1958     6
## 115        491 1958     7
## 116        505 1958     8
## 117        404 1958     9
## 118        359 1958    10
## 119        310 1958    11
## 120        337 1958    12
## 121        360 1959     1
## 122        342 1959     2
## 123        406 1959     3
## 124        396 1959     4
## 125        420 1959     5
## 126        472 1959     6
## 127        548 1959     7
## 128        559 1959     8
## 129        463 1959     9
## 130        407 1959    10
## 131        362 1959    11
## 132        405 1959    12
## 133        417 1960     1
## 134        391 1960     2
## 135        419 1960     3
## 136        461 1960     4
## 137        472 1960     5
## 138        535 1960     6
## 139        622 1960     7
## 140        606 1960     8
## 141        508 1960     9
## 142        461 1960    10
## 143        390 1960    11
## 144        432 1960    12

Looking at the average trend per year

plot(with(air, tapply(passengers, year, mean)) ~ I(1949:1960),
     ylab = "Mean number of passengers", xlab = "Year", type = "b")

Looking at the average trend per month

plot(with(air, tapply(passengers, month, mean)) ~ I(1:12),
     ylab = "Mean number of passengers", xlab = "Month", type = "b")

Simple fit

(mod_air <- lm(passengers ~ year * month, data = air))
## 
## Call:
## lm(formula = passengers ~ year * month, data = air)
## 
## Coefficients:
##  (Intercept)          year        month2        month3        month4        month5        month6  
##   -5.407e+04     2.779e+01     6.356e+03     2.129e+02    -3.118e+03    -7.275e+03    -1.786e+04  
##       month7        month8        month9       month10       month11       month12   year:month2  
##   -2.994e+04    -2.936e+04    -1.232e+04    -5.237e+03     3.060e+03    -1.094e+03    -3.255e+00  
##  year:month3   year:month4   year:month5   year:month6   year:month7   year:month8   year:month9  
##   -9.441e-02     1.608e+00     3.738e+00     9.171e+00     1.537e+01     1.508e+01     6.336e+00  
## year:month10  year:month11  year:month12  
##    2.692e+00    -1.570e+00     5.699e-01

The problem

plot(residuals(mod_air), type = "b")
abline(h = 0, lty = 2, col = "red")

The problem

lmtest::dwtest(mod_air)
## 
##  Durbin-Watson test
## 
## data:  mod_air
## DW = 0.35903, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
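The Durbin-Watson statistic can be recomputed by hand, which clarifies what it measures: it is approximately 2(1 - r1), where r1 is the lag-1 autocorrelation of the residuals, so values near 0 indicate strong positive autocorrelation and values near 2 indicate none. A minimal sketch (the helper name dw_stat is ours):

```r
## Durbin-Watson by hand: squared successive residual differences
## over the residual sum of squares; approximately 2 * (1 - r1)
dw_stat <- function(e) sum(diff(e)^2) / sum(e^2)
## in the slides: dw_stat(residuals(mod_air)) gives about 0.36, as reported above
set.seed(1)
dw_stat(rnorm(1000))  ## near 2 for white noise
```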

The problem

acf(residuals(mod_air))

Solution

library(nlme)
MAR1 <- corAR1(value = 0.5, form = ~ 1|year, fixed = FALSE)
MAR1 <- Initialize(MAR1, data = air)
round(corMatrix(MAR1)[["1950"]], 2)
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
##  [1,] 1.00 0.50 0.25 0.13 0.06 0.03 0.02 0.01 0.00  0.00  0.00  0.00
##  [2,] 0.50 1.00 0.50 0.25 0.13 0.06 0.03 0.02 0.01  0.00  0.00  0.00
##  [3,] 0.25 0.50 1.00 0.50 0.25 0.13 0.06 0.03 0.02  0.01  0.00  0.00
##  [4,] 0.13 0.25 0.50 1.00 0.50 0.25 0.13 0.06 0.03  0.02  0.01  0.00
##  [5,] 0.06 0.13 0.25 0.50 1.00 0.50 0.25 0.13 0.06  0.03  0.02  0.01
##  [6,] 0.03 0.06 0.13 0.25 0.50 1.00 0.50 0.25 0.13  0.06  0.03  0.02
##  [7,] 0.02 0.03 0.06 0.13 0.25 0.50 1.00 0.50 0.25  0.13  0.06  0.03
##  [8,] 0.01 0.02 0.03 0.06 0.13 0.25 0.50 1.00 0.50  0.25  0.13  0.06
##  [9,] 0.00 0.01 0.02 0.03 0.06 0.13 0.25 0.50 1.00  0.50  0.25  0.13
## [10,] 0.00 0.00 0.01 0.02 0.03 0.06 0.13 0.25 0.50  1.00  0.50  0.25
## [11,] 0.00 0.00 0.00 0.01 0.02 0.03 0.06 0.13 0.25  0.50  1.00  0.50
## [12,] 0.00 0.00 0.00 0.00 0.01 0.02 0.03 0.06 0.13  0.25  0.50  1.00
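The matrix above is simply the geometric decay of an AR(1) process: the correlation between two observations separated by k steps is phi^k. A sketch reproducing it without nlme (phi = 0.5, as set in corAR1 above):

```r
## AR(1) correlation: cor(obs_i, obs_j) = phi^|i - j|
phi <- 0.5
M <- outer(1:12, 1:12, function(i, j) phi^abs(i - j))
round(M[1:4, 1:4], 2)  ## matches the top-left corner of corMatrix(MAR1)[["1950"]]
```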

Solution

(mod_air2 <- lme(passengers ~ month * year, random = ~ 1 | year, data = air,
                 correlation = MAR1, method = "REML"))
## Linear mixed-effects model fit by REML
##   Data: air 
##   Log-restricted-likelihood: -477.871
##   Fixed: passengers ~ month * year 
##   (Intercept)        month2        month3        month4        month5        month6        month7 
## -5.406738e+04  6.355626e+03  2.129324e+02 -3.118268e+03 -7.275373e+03 -1.785545e+04 -2.993915e+04 
##        month8        month9       month10       month11       month12          year   month2:year 
## -2.935851e+04 -1.232239e+04 -5.237282e+03  3.059512e+03 -1.093845e+03  2.778671e+01 -3.255245e+00 
##   month3:year   month4:year   month5:year   month6:year   month7:year   month8:year   month9:year 
## -9.440559e-02  1.608392e+00  3.737762e+00  9.171329e+00  1.537413e+01  1.507692e+01  6.335664e+00 
##  month10:year  month11:year  month12:year 
##  2.692308e+00 -1.569930e+00  5.699301e-01 
## 
## Random effects:
##  Formula: ~1 | year
##         (Intercept) Residual
## StdDev:     13.4979  8.34667
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | year 
##  Parameter estimate(s):
##       Phi 
## 0.3270032 
## Number of Observations: 144
## Number of Groups: 12

Alternative code

(mod_air2b <- lme(passengers ~ month * year, random = ~ 1 | year, data = air,
                 correlation = corAR1(form = ~ 1|year), method = "REML"))
## Linear mixed-effects model fit by REML
##   Data: air 
##   Log-restricted-likelihood: -477.871
##   Fixed: passengers ~ month * year 
##   (Intercept)        month2        month3        month4        month5        month6        month7 
## -5.406738e+04  6.355626e+03  2.129324e+02 -3.118268e+03 -7.275373e+03 -1.785545e+04 -2.993915e+04 
##        month8        month9       month10       month11       month12          year   month2:year 
## -2.935851e+04 -1.232239e+04 -5.237282e+03  3.059512e+03 -1.093845e+03  2.778671e+01 -3.255245e+00 
##   month3:year   month4:year   month5:year   month6:year   month7:year   month8:year   month9:year 
## -9.440559e-02  1.608392e+00  3.737762e+00  9.171329e+00  1.537413e+01  1.507692e+01  6.335664e+00 
##  month10:year  month11:year  month12:year 
##  2.692308e+00 -1.569930e+00  5.699301e-01 
## 
## Random effects:
##  Formula: ~1 | year
##         (Intercept) Residual
## StdDev:     13.4979 8.346669
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | year 
##  Parameter estimate(s):
##       Phi 
## 0.3270031 
## Number of Observations: 144
## Number of Groups: 12

Testing the temporal autocorrelation

mod_air3 <- lme(passengers ~ month * year, random = ~ 1 | year, data = air, method = "REML")
anova(mod_air2, mod_air3)
##          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mod_air2     1 27 1009.742 1085.004 -477.8710                        
## mod_air3     2 26 1017.362 1089.837 -482.6811 1 vs 2 9.620272  0.0019
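The numbers in the table can be reproduced from its own columns: AIC = -2 logLik + 2k, and the likelihood-ratio statistic is twice the log-likelihood difference.

```r
## reproducing the anova table from the log-likelihoods and parameter counts
-2 * (-477.8710) + 2 * 27              ## 1009.742 (AIC of mod_air2)
-2 * (-482.6811) + 2 * 26              ## 1017.362 (AIC of mod_air3)
2 * (-477.8710 - (-482.6811))          ## 9.6202, the L.Ratio
```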

Alternative autocorrelation structures

mod_airARMA1 <- update(mod_air2, correlation = corARMA(form = ~ 1 | year, p = 1, q = 0))
mod_airARMA2 <- update(mod_air2, correlation = corARMA(form = ~ 1 | year, p = 4, q = 0))
mod_airARMA3 <- update(mod_air2, correlation = corARMA(form = ~ 1 | year, p = 2, q = 2))

rbind(mod_air2 = AIC(mod_air2),
      mod_airARMA1 = AIC(mod_airARMA1),
      mod_airARMA2 = AIC(mod_airARMA2),
      mod_airARMA3 = AIC(mod_airARMA3))
##                  [,1]
## mod_air2     1009.742
## mod_airARMA1 1009.742
## mod_airARMA2 1008.911
## mod_airARMA3 1011.174


Note: do not compare AICs or likelihoods from nlme to those from other packages!

(the likelihoods reported by nlme apparently omit a constant term, so absolute AIC and log-likelihood values are not comparable across packages…)

Fitted values

mod_air4 <- update(mod_air2, correlation = corARMA(form = ~ 1 | year, p = 4, q = 0), method = "ML")
data.for.plot <- expand.grid(month = factor(1:12), year = 1949:1960)
data.for.plot$obs <- air$passengers
data.for.plot$time <- seq(1949, 1960, length = (1960 - 1949 + 1) * 12)
data.for.plot$fit_lm <- predict(mod_air)
data.for.plot$fit_lme <- predict(mod_air4)

Fitted values

plot(obs ~ time, data = data.for.plot, type = "l", ylim = c(0, 700), ylab = "Passengers")
points(fit_lm ~ time, data = data.for.plot, type = "l", col = "red")
points(fit_lme ~ time, data = data.for.plot, type = "l", col = "blue")

Better, but good enough?

plot(residuals(mod_air4), type = "l")
abline(h = 0, lty = 2, col = "red")

Temporal autocorrelation in continuous time

The nlme::BodyWeight dataset

data("BodyWeight", package = "nlme")
plot(BodyWeight)

The nlme::BodyWeight dataset

body <- as.data.frame(BodyWeight)
body$Rat <- factor(body$Rat, levels = 1:16, ordered = FALSE)
str(body)
## 'data.frame':    176 obs. of  4 variables:
##  $ weight: num  240 250 255 260 262 258 266 266 265 272 ...
##  $ Time  : num  1 8 15 22 29 36 43 44 50 57 ...
##  $ Rat   : Factor w/ 16 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Diet  : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
unique(body$Time)
##  [1]  1  8 15 22 29 36 43 44 50 57 64
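Note the spacing: the steps are mostly 7 days, but not always (43 to 44), so the observations are not equally spaced and a discrete-lag AR(1) is not appropriate; this is what motivates continuous-time correlation structures. A quick check:

```r
## intervals between consecutive measurement times: unequal spacing
times <- c(1, 8, 15, 22, 29, 36, 43, 44, 50, 57, 64)
diff(times)  ## 7 7 7 7 7 7 1 6 7 7
```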

Fitting the model

(mod_rat1 <- lme(weight ~ Diet * Time, random = ~ Time|Rat, data = body))
## Linear mixed-effects model fit by REML
##   Data: body 
##   Log-restricted-likelihood: -575.8599
##   Fixed: weight ~ Diet * Time 
## (Intercept)       Diet2       Diet3        Time  Diet2:Time  Diet3:Time 
## 251.6516516 200.6654865 252.0716778   0.3596391   0.6058392   0.2983375 
## 
## Random effects:
##  Formula: ~Time | Rat
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev     Corr  
## (Intercept) 36.9390723 (Intr)
## Time         0.2484113 -0.149
## Residual     4.4436052       
## 
## Number of Observations: 176
## Number of Groups: 16

Checking residuals

plot(mod_rat1) ## there is some heteroscedasticity but we will ignore it for now

Checking residuals

plot(residuals(mod_rat1), type = "b")

Fitting continuous temporal autocorrelation

(mod_rat2 <- lme(weight ~ Diet * Time, random = ~ Time|Rat, correlation = corExp(form = ~ Time), data = body))
## Linear mixed-effects model fit by REML
##   Data: body 
##   Log-restricted-likelihood: -566.0296
##   Fixed: weight ~ Diet * Time 
## (Intercept)       Diet2       Diet3        Time  Diet2:Time  Diet3:Time 
## 251.5924463 200.6889077 252.3152061   0.3602961   0.6255177   0.3109452 
## 
## Random effects:
##  Formula: ~Time | Rat
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev     Corr  
## (Intercept) 36.9744887 (Intr)
## Time         0.2434607 -0.149
## Residual     4.5501283       
## 
## Correlation Structure: Exponential spatial correlation
##  Formula: ~Time | Rat 
##  Parameter estimate(s):
##    range 
## 3.650519 
## Number of Observations: 176
## Number of Groups: 16

Model comparison

anova(mod_rat1, mod_rat2)
##          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mod_rat1     1 10 1171.720 1203.078 -575.8599                        
## mod_rat2     2 11 1154.059 1188.553 -566.0296 1 vs 2 19.66057  <.0001


Note: the comparison makes sense as models are nested and fitted with REML.

Alternative correlation structures

mod_rat3 <- update(mod_rat1, corr = corExp(form = ~ Time, nugget = TRUE))
mod_rat4 <- update(mod_rat1, corr = corRatio(form = ~ Time))
mod_rat5 <- update(mod_rat1, corr = corSpher(form = ~ Time))
mod_rat6 <- update(mod_rat1, corr = corLin(form = ~ Time))
mod_rat7 <- update(mod_rat1, corr = corGaus(form = ~ Time))

rbind(mod_rat2 = AIC(mod_rat2),
      mod_rat3 = AIC(mod_rat3),
      mod_rat4 = AIC(mod_rat4),
      mod_rat5 = AIC(mod_rat5),
      mod_rat6 = AIC(mod_rat6),
      mod_rat7 = AIC(mod_rat7))
##              [,1]
## mod_rat2 1154.059
## mod_rat3 1151.768
## mod_rat4 1155.749
## mod_rat5 1157.233
## mod_rat6 1157.233
## mod_rat7 1157.233

The best model

mod_rat3
## Linear mixed-effects model fit by REML
##   Data: body 
##   Log-restricted-likelihood: -563.8841
##   Fixed: weight ~ Diet * Time 
## (Intercept)       Diet2       Diet3        Time  Diet2:Time  Diet3:Time 
## 251.3005207 200.6582307 253.3068385   0.3623183   0.6421429   0.3063663 
## 
## Random effects:
##  Formula: ~Time | Rat
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev     Corr  
## (Intercept) 36.9431391 (Intr)
## Time         0.2300954 -0.162
## Residual     5.7590445       
## 
## Correlation Structure: Exponential spatial correlation
##  Formula: ~Time | Rat 
##  Parameter estimate(s):
##     range    nugget 
## 17.150214  0.175613 
## Number of Observations: 176
## Number of Groups: 16
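The fitted correlation function can be sketched directly from the two estimates above: with a nugget, the correlation drops from 1 at distance zero to 1 - nugget, then decays exponentially with the range. To our understanding this matches nlme's corExp parameterisation, corr(d) = (1 - nugget) * exp(-d / range) for d > 0:

```r
## exponential correlation with a nugget, using the estimates reported above
range_est <- 17.150214
nugget_est <- 0.175613
expcorr <- function(d) ifelse(d == 0, 1, (1 - nugget_est) * exp(-d / range_est))
round(expcorr(c(0, 1, 7, 30)), 2)  ## 1.00 0.78 0.55 0.14
```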

Same fit with spaMM

(mod_rat_spaMM <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                            HLmethod = "REML", init.corrHLfit = list(Nugget = 0.1), ranFix = list(nu = 0.5)))
## formula: weight ~ Diet * Time + (Time | Rat) + Matern(1 | Time)
## REML: Estimation of lambda, phi and corrPars by REML.
##       Estimation of fixed effects by ML.
## Family: gaussian ( link = identity ) 
##  ------------ Fixed effects (beta) ------------
##             Estimate Cond. SE t-value
## (Intercept) 251.6434 13.15460  19.130
## Diet2       200.6655 22.67952   8.848
## Diet3       252.0717 22.67952  11.115
## Time          0.3682  0.09668   3.808
## Diet2:Time    0.6058  0.15786   3.838
## Diet3:Time    0.2983  0.15786   1.890
##  --------------- Random effects ---------------
## Family: gaussian ( link = identity ) 
##                    --- Correlation parameters:
##      2.nu  2.Nugget     2.rho 
## 0.5000000 0.0000000 0.1695018 
##          --- Random-coefficients Cov matrices:
##  Group        Term   Var.   Corr.
##    Rat (Intercept)   1366        
##    Rat        Time 0.0624 -0.1507
##            --- Variance parameters ('lambda'):
## lambda = var(u) for u ~ Gaussian; 
##    Time  :  2.844  
##              --- Coefficients for log(lambda):
##  Group        Term Estimate Cond.SE
##   Time (Intercept)    1.045  0.5868
## # of obs: 176; # of groups: Rat, 32; Time, 11 
##  ------------- Residual variance  -------------
## Coefficients for log(phi) ~ 1 :
##             Estimate Cond. SE
## (Intercept)    2.826     0.12
## Estimate of phi=residual var:  16.88 
##  ------------- Likelihood values  -------------
##                         logLik
## p_v(h) (marginal L): -577.0637
##   p_beta,v(h) (ReL): -569.5571

Fitted values: nlme vs spaMM

plot(predict(mod_rat_spaMM), predict(mod_rat3))
abline(0, 1, col = "red")

Better fit with spaMM?

mod_rat_spaMM2 <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                            HLmethod = "REML", init.corrHLfit = list(Nugget = 0))

mod_rat_spaMM3 <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                            HLmethod = "REML", ranFix = list(Nugget = 0, nu = 0.5))

Comparison between exponential and unconstrained Matern

print(AIC(mod_rat_spaMM))
##        marginal AIC: 1176.1275
##     conditional AIC: 1035.8757
##      dispersion AIC: 1149.1142
##        effective df:  139.0042
print(AIC(mod_rat_spaMM2))
##        marginal AIC: 1175.7517
##     conditional AIC: 1035.7574
##      dispersion AIC: 1148.7629
##        effective df:  139.3773
print(AIC(mod_rat_spaMM3))
##        marginal AIC: 1176.1275
##     conditional AIC: 1035.8757
##      dispersion AIC: 1149.1142
##        effective df:  139.0042

Displaying the continuous correlation function

time.pred <- seq(0, 64, 0.1)
corr <- MaternCorr(time.pred, rho = mod_rat_spaMM$corrPars$rho, nu = 0.5, Nugget = mod_rat_spaMM$corrPars$Nugget)
plot(corr ~ time.pred, xlab = "Time (days)", ylab = "Correlation", type = "l")
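A side note on why fixing nu = 0.5 recovers nlme's corExp: at nu = 0.5 the Matérn correlation collapses to the exponential. This can be checked in base R (the formula below is the standard Matérn parameterisation, which to our understanding matches spaMM's rho scaling):

```r
## Matern correlation; besselK is the modified Bessel function of the second kind
matern <- function(d, rho, nu) {
  x <- rho * d
  ifelse(d == 0, 1, 2^(1 - nu) / gamma(nu) * x^nu * besselK(x, nu))
}
all.equal(matern(1:10, rho = 0.17, nu = 0.5), exp(-0.17 * (1:10)))  ## TRUE
```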

Testing the overall effect of diet

spaMM

mod_rat_spaMM3ML <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                            HLmethod = "ML", ranFix = list(Nugget = 0, nu = 0.5))
mod_rat_no_diet <- corrHLfit(weight ~ 1 + Time + (Time|Rat) + Matern(1|Time), data = body,
                            HLmethod = "ML", ranFix = list(Nugget = 0, nu = 0.5))
anova(mod_rat_spaMM3ML, mod_rat_no_diet)
##     chi2_LR df      p_value
## p_v 47.5844  4 1.152111e-09
c(logLik(mod_rat_spaMM3ML), logLik(mod_rat_no_diet))
##       p_v       p_v 
## -576.8528 -600.6450
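The likelihood-ratio statistic in the table is just twice the difference between the two marginal log-likelihoods, with df = 4 (the two Diet contrasts plus their two interactions with Time):

```r
## reproducing the chi2_LR statistic and p-value from the log-likelihoods above
2 * (-576.8528 - (-600.6450))                  ## 47.5844
pchisq(47.5844, df = 4, lower.tail = FALSE)    ## ~1.15e-09
```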

Testing the overall effect of diet

nlme

mod_rat3ML <- lme(weight ~ Diet * Time, random = ~ Time|Rat,
                  correlation = corExp(form = ~ Time, nugget = TRUE), data = body, method = "ML")

mod_rat_no_diet2 <- lme(weight ~ 1 + Time, random = ~ Time|Rat,
                        correlation = corExp(form = ~ Time, nugget = TRUE), data = body, method = "ML")
anova(mod_rat3ML, mod_rat_no_diet2)
##                  Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mod_rat3ML           1 12 1165.962 1204.007 -570.9808                        
## mod_rat_no_diet2     2  8 1206.618 1231.982 -595.3088 1 vs 2 48.65601  <.0001

Revisiting the AirPassengers data with spaMM::fitme()

air$time <- seq(1949, 1960, length = (1960 - 1949 + 1) * 12)

mod_air_spaMM1 <- fitme(passengers ~ month * year + Matern(1|time), data = air, method = "REML")

mod_air_spaMM2 <- fitme(passengers ~ month * year + Matern(1|time), data = air, method = "REML",
                        init = list(Nugget = 0))

mod_air_spaMM3 <- fitme(passengers ~ month * year + Matern(1|time), data = air, method = "REML",
                        init = list(Nugget = 0), fixed = list(nu = 0.5))

mod_air_spaMM4 <- fitme(passengers ~ month * year + Matern(1|time), data = air, method = "REML",
                        fixed = list(nu = 0.5, Nugget = 0))

Model comparison

print(AIC(mod_air_spaMM1))
##        marginal AIC: 1059.53220
##     conditional AIC: 1051.05825
##      dispersion AIC:  944.89040
##        effective df:   40.35844
print(AIC(mod_air_spaMM2))
##        marginal AIC:  1059.5322
##     conditional AIC: -2253.7446
##      dispersion AIC:   944.8904
##        effective df:  1607.3426
print(AIC(mod_air_spaMM3))
##        marginal AIC: 1059.8193
##     conditional AIC:  433.1111
##      dispersion AIC:  944.9038
##        effective df:  352.6168
print(AIC(mod_air_spaMM4))
##        marginal AIC: 1059.8193
##     conditional AIC:  433.3008
##      dispersion AIC:  944.9038
##        effective df:  352.5219

Examining the best model

mod_air_spaMM2
## formula: passengers ~ month * year + Matern(1 | time)
## REML: Estimation of lambda, phi and corrPars by REML.
##       Estimation of fixed effects by ML.
## Family: gaussian ( link = identity ) 
##  ------------ Fixed effects (beta) ------------
##                Estimate  Cond. SE  t-value
## (Intercept)  -5.489e+04 8334.8129  -6.5855
## month2        6.660e+03 1524.5606   4.3685
## month3        8.221e+02 1663.6994   0.4942
## month4       -2.204e+03 1765.2063  -1.2486
## month5       -6.056e+03 1836.5619  -3.2974
## month6       -1.633e+04 1881.7191  -8.6786
## month7       -2.811e+04 1902.7834 -14.7725
## month8       -2.722e+04 1900.6314 -14.3229
## month9       -9.880e+03 1875.1205  -5.2692
## month10      -2.489e+03 1825.0411  -1.3637
## month11       6.115e+03 1747.7615   3.4986
## month12       2.269e+03 1638.1640   1.3853
## year          2.822e+01    4.2632   6.6189
## month2:year  -3.411e+00    0.7800  -4.3729
## month3:year  -4.061e-01    0.8512  -0.4771
## month4:year   1.141e+00    0.9031   1.2630
## month5:year   3.114e+00    0.9397   3.3138
## month6:year   8.391e+00    0.9628   8.7158
## month7:year   1.444e+01    0.9735  14.8301
## month8:year   1.398e+01    0.9724  14.3805
## month9:year   5.086e+00    0.9594   5.3016
## month10:year  1.286e+00    0.9338   1.3774
## month11:year -3.133e+00    0.8942  -3.5038
## month12:year -1.151e+00    0.8381  -1.3730
##  --------------- Random effects ---------------
## Family: gaussian ( link = identity ) 
##                    --- Correlation parameters:
##    1.Nugget        1.nu       1.rho 
## 0.001617956 0.479081102 0.009154930 
##            --- Variance parameters ('lambda'):
## lambda = var(u) for u ~ Gaussian; 
##    time  :  11980  
##              --- Coefficients for log(lambda):
##  Group        Term Estimate Cond.SE
##   time (Intercept)    9.391  0.1494
## # of obs: 144; # of groups: time, 144 
##  ------------- Residual variance  -------------
## Coefficients for log(phi) ~ 1 :
##             Estimate Cond. SE
## (Intercept)     2.61   0.2566
## Estimate of phi=residual var:  13.61 
##  ------------- Likelihood values  -------------
##                         logLik
## p_v(h) (marginal L): -503.7661
##   p_beta,v(h) (ReL): -470.4452

Fitted values

data.for.plot$pred_spaMM <- predict(mod_air_spaMM2)
plot(obs ~ time, data = data.for.plot, type = "l", lwd = 3, ylim = c(0, 700), ylab = "Passengers")
points(pred_spaMM ~ time, data = data.for.plot, type = "l", col = "green")

Note: never extrapolate with such a model! A near-perfect fit to the observed data is not unusual.

Testing the effect of years

spaMM

mod_air_spaMM2ML <- fitme(passengers ~ month + year + Matern(1|time), data = air, method = "ML",
                        init = list(Nugget = 0))
## (This warning will show only once)
##  Low (<1e-12) fitted residual variance (phi): this may be a genuine result
##  for data without appropriate replicates and a model that allows overfitting, but
##  (1) this may also point to problems in the data (duplicated response values?);
##  (2) this may crash later computations. You may overcome this by setting
##      control.HLfit$min_phi to 1e-10 or some other low, but not too low, value.
##      Still, the computed likelihood maximum wrt all parameters may be inaccurate.
mod_air_no_year <- fitme(passengers ~ month + Matern(1|time), data = air, method = "ML",
                        init = list(Nugget = 0))
anova(mod_air_spaMM2ML, mod_air_no_year)
##      chi2_LR df      p_value
## p_v 38.93713  1 4.376773e-10
c(logLik(mod_air_spaMM2ML), logLik(mod_air_no_year))
##       p_v       p_v 
## -593.5887 -613.0573

Testing the effect of years

nlme

mod_air3ML <- lme(passengers ~ month + year, random = ~ 1 | year, data = air,
                 correlation = corARMA(p = 6, q = 0), method = "ML")

mod_air_no_year2 <- lme(passengers ~ month, random = ~ 1 | year, data = air,
                 correlation = corARMA(p = 6, q = 0), method = "ML")
anova(mod_air3ML, mod_air_no_year2)
##                  Model df      AIC      BIC    logLik   Test L.Ratio p-value
## mod_air3ML           1 21 1209.245 1271.611 -583.6224                       
## mod_air_no_year2     2 20 1258.970 1318.367 -609.4851 1 vs 2 51.7254  <.0001

Spatial autocorrelation

Maximum normalised-difference vegetation index in north Cameroon

data("Loaloa")
ndvi <- Loaloa[, c("maxNDVI", "latitude", "longitude")]
head(ndvi)
##    maxNDVI latitude longitude
## X1    0.69 5.736750  8.041860
## X2    0.74 5.680280  8.004330
## X3    0.79 5.347222  8.905556
## X4    0.67 5.917420  8.100720
## X5    0.85 5.104540  8.182510
## X6    0.80 5.355556  8.929167

Visualising the data

library(maps)
spaMMplot2D(x = ndvi$longitude, y = ndvi$latitude, z = ndvi$maxNDVI, add.map = TRUE,
            xlab = "Longitude", ylab = "Latitude", plot.title = title(main = "max NDVI"))

Visualising the data

pairs(ndvi)

Fitting the model

(mod_ndvi1 <- fitme(maxNDVI ~ 1 + Matern(1|longitude + latitude), data = ndvi, method = "REML"))
## formula: maxNDVI ~ 1 + Matern(1 | longitude + latitude)
## REML: Estimation of lambda, phi and corrPars by REML.
##       Estimation of fixed effects by ML.
## Family: gaussian ( link = identity ) 
##  ------------ Fixed effects (beta) ------------
##             Estimate Cond. SE t-value
## (Intercept)   0.8006  0.02391   33.48
##  --------------- Random effects ---------------
## Family: gaussian ( link = identity ) 
##                    --- Correlation parameters:
##      1.nu     1.rho 
## 0.4066726 0.9345780 
##            --- Variance parameters ('lambda'):
## lambda = var(u) for u ~ Gaussian; 
##    longitude.  :  0.004322  
##              --- Coefficients for log(lambda):
##       Group        Term Estimate Cond.SE
##  longitude. (Intercept)   -5.444  0.1229
## # of obs: 197; # of groups: longitude., 197 
##  ------------- Residual variance  -------------
## Coefficients for log(phi) ~ 1 :
##             Estimate Cond. SE
## (Intercept)   -8.219   0.1775
## Estimate of phi=residual var:  0.0002696 
##  ------------- Likelihood values  -------------
##                        logLik
## p_v(h) (marginal L): 378.1405
##   p_beta,v(h) (ReL): 375.3261

Predictions

mapMM(mod_ndvi1, add.map = TRUE, plot.title = title(xlab = "Longitude", ylab = "Latitude"))

Predictions

filled.mapMM(mod_ndvi1, add.map = TRUE, plot.title = title(xlab = "Longitude", ylab = "Latitude"))

Prediction uncertainty

x.for.pred <- seq(min(ndvi$longitude), max(ndvi$longitude), length.out = 100)
y.for.pred <- seq(min(ndvi$latitude), max(ndvi$latitude), length.out = 50)
data.for.pred <- expand.grid(longitude = x.for.pred, latitude = y.for.pred)
gridpred <- predict(mod_ndvi1, newdata = data.for.pred, variances = list(predVar = TRUE))
data.for.pred$predVar <- attr(gridpred, "predVar")
m <- matrix(data.for.pred$predVar, ncol = length(y.for.pred))

Prediction uncertainty

spaMM.filled.contour(x = x.for.pred, y = y.for.pred, z = m, plot.axes = {
  points(ndvi[, c("longitude", "latitude")])}, col = spaMM.colors(redshift = 3))

Non-Gaussian response

GLMM

GLM + LMM = GLMM

\[\begin{array}{lcl} \mu &=& g^{-1}(\eta)\\ \mu &=& g^{-1}(\mathbf{X}\beta + \mathbf{Z}b)\\ \end{array} \]

with (as for GLM):

  • \(\text{E}(\text{Y}) = \mu = g^{-1}(\eta)\)
  • \(\text{Var}(\text{Y}) = \phi\text{V}(\mu)\)


Note:

  • If \(g^{-1}\) is the identity function, \(\phi = \sigma^2\) and \(\text{V}(\mu) = 1\), we have the LMM.
  • If \(\mathbf{Z}b = 0\), we have the GLM.
  • If \(g^{-1}\) is the identity function, \(\phi = \sigma^2\), \(\text{V}(\mu) = 1\), and \(\mathbf{Z}b = 0\), we have the LM.
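The last note can be verified directly: a Gaussian GLM with identity link gives exactly the LM fit. A quick base-R sketch on simulated data (the variable names are made up for the example):

```r
set.seed(123)
d <- data.frame(x = rnorm(50))
d$y <- 2 + 3 * d$x + rnorm(50)
mod_lm  <- lm(y ~ x, data = d)
mod_glm <- glm(y ~ x, data = d, family = gaussian(link = "identity"))
all.equal(coef(mod_lm), coef(mod_glm))  # TRUE: same estimates
```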

The LM2GLMM::Flatwork dataset

Flatwork
##    individual gender month shopping cleaning other absent
## 1           1      w   nov        4        8     3      0
## 2           2      w   nov       10        5    10      7
## 3           3      w   nov        3        1     2      4
## 4           4      w   nov       15        2     0      0
## 5           5      w   nov       15        9     4      0
## 6           6      m   nov        0        2     8      0
## 7           7      w   nov        5        6     6     10
## 8           8      m   nov        3        4     5     11
## 9           9      m   nov        5        7     6      4
## 10         10      m   nov        6        3     0      7
## 11          1      w   dec        7        6     3      1
## 12          2      w   dec        6        2     3      7
## 13          3      w   dec        5        7     1      5
## 14          4      w   dec       14        5     1      0
## 15          5      w   dec        5        5     1      0
## 16          6      m   dec        6        1     0      5
## 17          7      w   dec        3        1     0      1
## 18          8      m   dec        2        1     0      0
## 19          9      m   dec        4        0     0      5
## 20         10      m   dec        7        3     1      3
## 21          1      w   jan       11        3     3      1
## 22          2      w   jan        3        8     2      7
## 23          3      w   jan        7        7    12      4
## 24          4      w   jan        5        2     5      0
## 25          5      w   jan        6        6     0      0
## 26          6      m   jan       13        0     5      2
## 27          7      w   jan        2       16     6      6
## 28          8      m   jan        7        6     9      0
## 29          9      m   jan        0        4     2      5
## 30         10      m   jan        6        3     9      7
## 31          1      w   feb        8        3     1      7
## 32          2      w   feb        2        4     1      6
## 33          3      w   feb        1        7    16      1
## 34          4      w   feb        0        0     0      0
## 35          5      w   feb       12       10     6      0
## 36          6      m   feb        7        0     1      0
## 37          7      w   feb        3        2    10      6
## 38          8      m   feb        7        4     3      0
## 39          9      m   feb        0        0     3      7
## 40         10      m   feb       16        2     0      0
## 41          1      w   mar        0        0     0      0
## 42          2      w   mar       17        2     5     11
## 43          3      w   mar        0        6     3      3
## 44          4      w   mar       11        5     6     12
## 45          5      w   mar        8       15    12      0
## 46          6      m   mar       10        0    14      0
## 47          7      w   mar        6        5     6      7
## 48          8      m   mar        8        6     6      0
## 49          9      m   mar       11        0     0      8
## 50         10      m   mar       10        2     1      0
## 51          1      w   apr        4        2     0      0
## 52          2      w   apr        9        9    12      0
## 53          3      w   apr       19        3     2      3
## 54          4      w   apr       14        4     6      6
## 55          5      w   apr        3        1     5      0
## 56          6      m   apr        4        1     0      0
## 57          7      w   apr       10       13     8      0
## 58          8      m   apr        0        2     2     10
## 59          9      m   apr        1        6     0      0
## 60         10      m   apr        4        4     1      0

The LM2GLMM::Flatwork dataset

str(Flatwork)
## 'data.frame':    60 obs. of  7 variables:
##  $ individual: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ gender    : Factor w/ 2 levels "m","w": 2 2 2 2 2 1 2 1 1 1 ...
##  $ month     : Factor w/ 6 levels "apr","dec","feb",..: 6 6 6 6 6 6 6 6 6 6 ...
##  $ shopping  : int  4 10 3 15 15 0 5 3 5 6 ...
##  $ cleaning  : int  8 5 1 2 9 2 6 4 7 3 ...
##  $ other     : int  3 10 2 0 4 8 6 5 6 0 ...
##  $ absent    : int  0 7 4 0 0 0 10 11 4 7 ...

GLMM with lme4

(mod_glmm_lme4 <- glmer(shopping ~ gender + (1|individual) + (1|month), family = poisson(),
                        data = Flatwork))
## Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
##  Family: poisson  ( log )
## Formula: shopping ~ gender + (1 | individual) + (1 | month)
##    Data: Flatwork
##       AIC       BIC    logLik  deviance  df.resid 
##  421.7239  430.1012 -206.8619  413.7239        56 
## Random effects:
##  Groups     Name        Std.Dev.
##  individual (Intercept) 0.2261  
##  month      (Intercept) 0.0510  
## Number of obs: 60, groups:  individual, 10; month, 6
## Fixed Effects:
## (Intercept)      genderw  
##      1.7107       0.2159
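Since the link is log, these fixed effects live on the log scale; exponentiating turns them into expected counts (for random effects equal to 0) and ratios. A small sketch using the estimates printed above:

```r
exp(1.7107)           # expected shopping hours for men: ~5.53
exp(1.7107 + 0.2159)  # expected shopping hours for women: ~6.87
exp(0.2159)           # women/men ratio: ~1.24
```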

GLMM with spaMM

(mod_glmm_spaMM <- fitme(shopping ~ gender + (1|individual) + (1|month), family = poisson(),
                        data = Flatwork, method = "ML"))
## formula: shopping ~ gender + (1 | individual) + (1 | month)
## Estimation of corrPars and lambda by Laplace ML approximation (p_v).
## Estimation of fixed effects by Laplace ML approximation (p_v).
## Estimation of lambda by 'outer' ML, maximizing p_v.
## Family: poisson ( link = log ) 
##  ------------ Fixed effects (beta) ------------
##             Estimate Cond. SE t-value
## (Intercept)   1.7107   0.1440  11.879
## genderw       0.2159   0.1813   1.191
##  --------------- Random effects ---------------
## Family: gaussian ( link = identity ) 
##            --- Variance parameters ('lambda'):
## lambda = var(u) for u ~ Gaussian; 
##    individual  :  0.05113 
##    month  :  0.002602  
## # of obs: 60; # of groups: individual, 10; month, 6 
##  ------------- Likelihood values  -------------
##                         logLik
## p_v(h) (marginal L): -206.8619

Checking residuals

library(DHARMa)
r <- simulateResiduals(mod_glmm_lme4)
plot(r)
## DHARMa::plotResiduals - low number of unique predictor values, consider setting asFactor = T
## DHARMa::plotResiduals - low number of unique predictor values, consider setting asFactor = T

Extra 0s?

barplot(table(Flatwork$shopping))

Extra 0s?

testZeroInflation(r)

## 
##  DHARMa zero-inflation test via comparison to expected zeros with simulation under H0 =
##  fitted model
## 
## data:  r
## ratioObsExp = 30.702, p-value < 2.2e-16
## alternative hypothesis: more
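The `ratioObsExp` statistic compares observed zeros to the zeros expected under the fitted model. The idea can be reproduced by hand for a plain Poisson fit, since under Poisson(\(\lambda\)) we have \(\text{P}(Y = 0) = e^{-\lambda}\). A base-R sketch on simulated zero-inflated data (not the Flatwork data):

```r
set.seed(42)
n <- 500
y <- rpois(n, lambda = 3)
y[sample(n, 150)] <- 0                  # inject extra zeros
lambda_hat <- mean(y)                   # Poisson ML estimate
expected_zeros <- n * exp(-lambda_hat)  # zeros expected under Poisson
observed_zeros <- sum(y == 0)
observed_zeros / expected_zeros         # ratio >> 1 flags zero-inflation
```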

Binomial model

Flatwork$shopping_bin <- Flatwork$shopping > 0
(mod_glmm_lme4bin <- glmer(shopping_bin ~ gender + (1|individual) + (1|month), family = binomial(),
                        data = Flatwork))
## Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
##  Family: binomial  ( logit )
## Formula: shopping_bin ~ gender + (1 | individual) + (1 | month)
##    Data: Flatwork
##      AIC      BIC   logLik deviance df.resid 
##  50.2791  58.6565 -21.1396  42.2791       56 
## Random effects:
##  Groups     Name        Std.Dev. 
##  individual (Intercept) 8.887e-09
##  month      (Intercept) 4.127e-08
## Number of obs: 60, groups:  individual, 10; month, 6
## Fixed Effects:
## (Intercept)      genderw  
##      1.6094       0.7885

Checking residuals

r_bin <- simulateResiduals(mod_glmm_lme4bin)
plot(r_bin)
## DHARMa::plotResiduals - low number of unique predictor values, consider setting asFactor = T
## DHARMa::plotResiduals - low number of unique predictor values, consider setting asFactor = T

Overdispersion?

r_bin2 <- simulateResiduals(mod_glmm_lme4bin, refit = TRUE)  ## slow and convergence issues...
testOverdispersion(r_bin2)
## 
##  DHARMa nonparametric overdispersion test via comparison to simulation under H0 = fitted
##  model
## 
## data:  r_bin2
## dispersion = 1.2008, p-value = 0.064
## alternative hypothesis: overdispersion

Testing the gender effect

mod_glmm_lme4bin0 <- glmer(shopping_bin ~ 1 + (1|individual) + (1|month), family = binomial(),
                        data = Flatwork)

anova(mod_glmm_lme4bin, mod_glmm_lme4bin0)
## Data: Flatwork
## Models:
## mod_glmm_lme4bin0: shopping_bin ~ 1 + (1 | individual) + (1 | month)
## mod_glmm_lme4bin: shopping_bin ~ gender + (1 | individual) + (1 | month)
##                   Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
## mod_glmm_lme4bin0  3 49.228 55.511 -21.614   43.228                         
## mod_glmm_lme4bin   4 50.279 58.657 -21.140   42.279 0.9485      1     0.3301

Same with spaMM

mod_glmm_spaMMbin <- fitme(shopping_bin ~ gender + (1|individual) + (1|month), family = binomial(),
                        data = Flatwork)

mod_glmm_spaMMbin0 <- fitme(shopping_bin ~ 1 + (1|individual) + (1|month), family = binomial(),
                        data = Flatwork)

anova(mod_glmm_spaMMbin, mod_glmm_spaMMbin0)
##       chi2_LR df   p_value
## p_v 0.9485333  1 0.3300929
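Both `anova()` calls perform a likelihood-ratio test: twice the difference in log-likelihoods, compared to a chi-squared distribution with df equal to the difference in the number of parameters. The mechanics, using the log-likelihoods printed in the lme4 output above:

```r
# LRT by hand from the printed log-likelihoods
logLik_full <- -21.140  # model with gender
logLik_null <- -21.614  # model without gender
chi2 <- 2 * (logLik_full - logLik_null)
chi2                     # ~0.948, as printed
1 - pchisq(chi2, df = 1) # p ~0.33, as printed
```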

Is there an effect for the non-zeros?

This is not ideal, but we will try an analysis on the truncated distribution…

Flatwork_pos <- subset(Flatwork, Flatwork$shopping_bin)
barplot(table(Flatwork_pos$shopping))
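Why is this "not ideal"? Truncation changes the distribution: for a zero-truncated Poisson, \(\text{P}(Y = k \mid Y > 0) = \text{dpois}(k, \lambda) / (1 - \text{dpois}(0, \lambda))\), so fitting an untruncated Poisson to the positive counts biases \(\lambda\) upwards. A base-R sketch of the truncated probabilities (with a hypothetical \(\lambda\)):

```r
lambda <- 5  # hypothetical rate
k <- 1:30
p_trunc <- dpois(k, lambda) / (1 - dpois(0, lambda))  # zero-truncated pmf
sum(p_trunc)      # ~1: a proper distribution on k >= 1
sum(k * p_trunc)  # the truncated mean exceeds lambda
```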

Fitting models on truncated distributions

mod_glmm_lme4pos1 <- glmer(shopping ~ gender + (1|individual) + (1|month), family = poisson(),
                        data = Flatwork_pos)
mod_glmm_lme4pos2 <- glmer.nb(shopping ~ gender + (1|individual) + (1|month), data = Flatwork_pos)
mod_glmm_spaMMpos1 <- fitme(shopping ~ gender + (1|individual) + (1|month), family = poisson(),
                        data = Flatwork_pos)
mod_glmm_spaMMpos2 <- fitme(shopping ~ gender + (1|individual) + (1|month), family = spaMM::negbin(),
                        data = Flatwork_pos)
mod_glmm_spaMMpos3 <- fitme(shopping ~ gender + (1|individual) + (1|month), family = COMPoisson(),
                        data = Flatwork_pos)

c(AIC(mod_glmm_lme4pos1), AIC(mod_glmm_lme4pos2))
## [1] 328.9853 306.3538
print(c(AIC(mod_glmm_spaMMpos1), AIC(mod_glmm_spaMMpos2), AIC(mod_glmm_spaMMpos3)))
##        marginal AIC: 328.98531
##     conditional AIC: 318.21443
##      dispersion AIC: 324.98531
##        effective df:  43.82578
##        marginal AIC: 306.35383
##     conditional AIC: 302.35376
##      dispersion AIC: 302.35383
##        effective df:  50.99975
##        marginal AIC: 359.00219
##     conditional AIC: 347.61158
##      dispersion AIC: 355.00219
##        effective df:  41.75854
##        marginal AIC:     conditional AIC:      dispersion AIC:        effective df: 
##            328.98531            318.21443            324.98531             43.82578 
##        marginal AIC:     conditional AIC:      dispersion AIC:        effective df: 
##            306.35383            302.35376            302.35383             50.99975 
##        marginal AIC:     conditional AIC:      dispersion AIC:        effective df: 
##            359.00219            347.61158            355.00219             41.75854

Checking residuals

r_pos <- simulateResiduals(mod_glmm_lme4pos2)
plot(r_pos)
## DHARMa::plotResiduals - low number of unique predictor values, consider setting asFactor = T
## DHARMa::plotResiduals - low number of unique predictor values, consider setting asFactor = T

Testing the gender effect again

mod_glmm_spaMMpos20 <- fitme(shopping ~ 1 + (1|individual) + (1|month), family = spaMM::negbin(),
                        data = Flatwork_pos)

anova(mod_glmm_spaMMpos20, mod_glmm_spaMMpos2)
##       chi2_LR df   p_value
## p_v 0.4453586  1 0.5045474

Practice


What about gender and cleaning?

Non-Gaussian random effects

The Hierarchical GLM (HGLM)

\[\begin{array}{lcl} \mu &=& g^{-1}(\eta)\\ \mu &=& g^{-1}(\mathbf{X}\beta + \mathbf{Z}b)\\ \mu &=& g^{-1}(\mathbf{X}\beta + \mathbf{Z}f(u)) \end{array} \]


Note:

  • If \(f(u)\) is the identity function and \(u\) is drawn from a normal distribution, then we have the GLMM, a particular case of the more general HGLM.
  • Hence LM, GLM, LMM and GLMM are all particular cases of the HGLM.

The example of the negative binomial

library(MASS)

mod_negbin <- glm.nb(Days ~ Sex/Age, data = quine)

quine$index <- factor(1:nrow(quine))

mod_poiss_gamma <- fitme(Days ~ Sex/Age + (1|index), data = quine,
                         family = poisson(), rand.family = Gamma("log"))

rbind(mod_negbin$coefficients, mod_poiss_gamma$fixef)
##      (Intercept)       SexM SexF:AgeF1 SexM:AgeF1  SexF:AgeF2 SexM:AgeF2 SexF:AgeF3 SexM:AgeF3
## [1,]    2.928524 -0.3957609 -0.3659809 -0.5868525 -0.01502935  0.6211936 -0.2894662  0.7709794
## [2,]    2.928524 -0.3957609 -0.3659809 -0.5868525 -0.01502935  0.6211936 -0.2894662  0.7709794


Note: the equivalence is expected! In more complex models differences may appear.
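The equivalence rests on a classical result: if \(u \sim \text{Gamma}\) with mean 1 and \(Y \mid u \sim \text{Poisson}(\mu u)\), then marginally \(Y\) is negative binomial. A base-R simulation sketch (the parameter values are made up):

```r
set.seed(1)
n <- 1e5
mu <- 4; shape <- 2                          # 'shape' plays the NB size role
u <- rgamma(n, shape = shape, rate = shape)  # mean 1, variance 1/shape
y_mix <- rpois(n, lambda = mu * u)           # Poisson-gamma mixture
y_nb  <- rnbinom(n, size = shape, mu = mu)   # negative binomial directly
c(mean(y_mix), var(y_mix))  # ~4 and ~ 4 + 4^2/2 = 12 (overdispersed)
c(mean(y_nb),  var(y_nb))   # same marginal distribution
```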

The beta-binomial HGLM


\[ \begin{array}{lcl} \text{logit}(p) &=& \text{ln}\left(\frac{p}{1-p}\right) = \mathbf{X}\beta + \mathbf{Z}b\\ b &=& \text{logit}(u) = \text{ln}\left(\frac{u}{1-u}\right) \end{array} \]


with \(u\) following the beta distribution.


It can be useful to model heterogeneity in proportions!
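A quick simulation shows why: drawing each unit's success probability from a Beta distribution inflates the variance of the counts relative to a plain binomial with the same mean. A base-R sketch (the parameter values are made up):

```r
set.seed(1)
n <- 1e4; size <- 50
p_mean <- 0.4
p <- rbeta(n, shape1 = 2, shape2 = 3)           # Beta with mean 2/5 = 0.4
y_bb  <- rbinom(n, size = size, prob = p)       # beta-binomial counts
y_bin <- rbinom(n, size = size, prob = p_mean)  # plain binomial counts
c(var(y_bin), var(y_bb))  # same mean, much larger beta-binomial variance
```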

The spaMM::seeds dataset

data(seeds)
seeds
##    plate seed  extract  r  n
## 1      1  O75     Bean 10 39
## 2      2  O75     Bean 23 62
## 3      3  O75     Bean 23 81
## 4      4  O75     Bean 26 51
## 5      5  O75     Bean 17 39
## 6      6  O73     Bean  8 16
## 7      7  O73     Bean 10 30
## 8      8  O73     Bean  8 28
## 9      9  O73     Bean 23 45
## 10    10  O73     Bean  0  4
## 11    11  O75 Cucumber  5  6
## 12    12  O75 Cucumber 53 74
## 13    13  O75 Cucumber 55 72
## 14    14  O75 Cucumber 32 51
## 15    15  O75 Cucumber 46 79
## 16    16  O75 Cucumber 10 13
## 17    17  O73 Cucumber  3 12
## 18    18  O73 Cucumber 22 41
## 19    19  O73 Cucumber 15 30
## 20    20  O73 Cucumber 32 51
## 21    21  O73 Cucumber  3  7

The spaMM::seeds dataset

coplot(r/n ~ plate | seed + extract, data = seeds)

Fitting the germination data using the HGLM

(mod_germ1 <- fitme(cbind(r, n - r) ~ seed * extract + (1|plate), family = binomial(), rand.family = Beta(),
                   data = seeds, method = "REML"))
## formula: cbind(r, n - r) ~ seed * extract + (1 | plate)
## Estimation of lambda by Laplace REML approximation (p_bv).
## Estimation of fixed effects by Laplace ML approximation (p_v).
## Family: binomial ( link = logit ) 
##  ------------ Fixed effects (beta) ------------
##                         Estimate Cond. SE t-value
## (Intercept)              -0.5483   0.1896 -2.8922
## seedO73                   0.0768   0.3070  0.2502
## extractCucumber           1.3526   0.2687  5.0330
## seedO73:extractCucumber  -0.8313   0.4279 -1.9429
##  --------------- Random effects ---------------
## Family: Beta ( link = logit ) 
##            --- Variance parameters ('lambda'):
## lambda = 4 var(u)/(1 - 4 var(u)) for u ~ Beta[1/(2*lambda),1/(2*lambda)]; 
##    plate  :  0.02378  
##              --- Coefficients for log(lambda):
##  Group        Term Estimate Cond.SE
##  plate (Intercept)   -3.739  0.5364
## # of obs: 21; # of groups: plate, 21 
##  ------------- Likelihood values  -------------
##                         logLik
## p_v(h) (marginal L): -54.05829
##   p_beta,v(h) (ReL): -56.59763

Comparison to the binomial GLMM

mod_germ2 <- fitme(cbind(r, n - r) ~ seed * extract + (1|plate), family = binomial(),
                   data = seeds, method = "REML")

print(rbind(AIC(mod_germ1), AIC(mod_germ2)))
##        marginal AIC: 118.1166
##     conditional AIC: 110.5305
##      dispersion AIC: 115.1953
##        effective df:  10.1160
##        marginal AIC: 118.076284
##     conditional AIC: 110.500869
##      dispersion AIC: 115.061157
##        effective df:   9.939376
##             marginal AIC:     conditional AIC:      dispersion AIC:        effective df:
## [1,]             118.1166             110.5305             115.1953            10.116005
## [2,]             118.0763             110.5009             115.0612             9.939376


OK… here it does not make much of a difference, but it is still worth trying.

Heteroscedasticity

Let's revisit the rats

mod_rat_spaMM <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                            HLmethod = "REML", init.corrHLfit = list(Nugget = 0.1), ranFix = list(nu = 0.5))
coplot(residuals(mod_rat_spaMM) ~ I(1:nrow(body)) | body$Diet, show.given = FALSE)

Let's revisit the rats

mod_rat_hetero <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                           HLmethod = "REML", init.corrHLfit = list(Nugget = 0.1), ranFix = list(nu = 0.5),
                           resid.formula = ~ Diet)
summary.tables <- summary(mod_rat_hetero)
summary.tables$phi_table
##               Estimate  Cond. SE
## (Intercept)  2.6702155 0.1698757
## Diet2       -0.3221617 0.2961118
## Diet3        0.6638114 0.2918609
print(rbind(AIC(mod_rat_spaMM),
            AIC(mod_rat_hetero)))
##        marginal AIC: 1176.1275
##     conditional AIC: 1035.8757
##      dispersion AIC: 1149.1142
##        effective df:  139.0042
##        marginal AIC: 1170.7321
##     conditional AIC: 1027.6438
##      dispersion AIC: 1143.6682
##        effective df:  138.8131
##             marginal AIC:     conditional AIC:      dispersion AIC:        effective df:
## [1,]             1176.127             1035.876             1149.114             139.0042
## [2,]             1170.732             1027.644             1143.668             138.8131

Let's re-test the overall effect of the diet

mod_rat_hetero <- corrHLfit(weight ~ Diet * Time + (Time|Rat) + Matern(1|Time), data = body,
                           HLmethod = "ML", init.corrHLfit = list(Nugget = 0.1), ranFix = list(nu = 0.5),
                           resid.formula = ~ Diet)

mod_rat_hetero0 <- corrHLfit(weight ~ Time + (Time|Rat) + Matern(1|Time), data = body,
                           HLmethod = "ML", init.corrHLfit = list(Nugget = 0.1), ranFix = list(nu = 0.5),
                           resid.formula = ~ Diet)

anova(mod_rat_hetero, mod_rat_hetero0)
##      chi2_LR df      p_value
## p_v 47.73391  4 1.072348e-09

You can handle heteroscedasticity in simple models too!

set.seed(1L)
d <- data.frame(y = c(rnorm(100, mean = 10, sd = sqrt(10)),
                      rnorm(100, mean = 20, sd = sqrt(20))),
                group = factor(c(rep("A", 100), rep("B", 100))))

m <- fitme(y ~ group, resid.model = ~ group, data = d, method = "REML")
unique(m$phi)
## [1]  8.067621 18.350646
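As a sanity check, the two fitted `phi` values should track the per-group residual variances, which we can compute directly with base R (recreating the same toy data, so the block is self-contained):

```r
# Same toy data as above: two groups with different true variances
set.seed(1L)
d <- data.frame(y = c(rnorm(100, mean = 10, sd = sqrt(10)),
                      rnorm(100, mean = 20, sd = sqrt(20))),
                group = factor(c(rep("A", 100), rep("B", 100))))
# Per-group sample variances: close to the two phi estimates printed above
tapply(d$y, d$group, var)
```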

An example of many difficulties combined: IsoriX

What is IsoriX?

library(IsoriX)
## 
##  IsoriX version 0.7.1 is loaded!
## 
##  The names of the functions and objects are not yet stable.
##  We keep revising them to make IsoriX more intuitive for you to use.
##  We will do our best to limit changes in names from version 1.0 onward!!
## 
##  Type:
##     * ?IsoriX for a short description.
##     * browseVignettes(package = 'IsoriX') for tutorials.
##     * news(package = 'IsoriX') for news.

Loading the GNIPDataDE data

dim(GNIPDataDE)
## [1] 8591    7
head(GNIPDataDE)
##   stationID   lat long elev year month isoscape.value
## 1 SCHLESWIG 54.52 9.54   43 1997     6          -59.0
## 2 SCHLESWIG 54.52 9.54   43 1997     7          -56.0
## 3 SCHLESWIG 54.52 9.54   43 1997     8          -60.8
## 4 SCHLESWIG 54.52 9.54   43 1997     9          -51.0
## 5 SCHLESWIG 54.52 9.54   43 1997    10          -58.7
## 6 SCHLESWIG 54.52 9.54   43 1997    11          -74.6

Aggregate the GNIPDataDE

GNIPDataDE_agg <- prepdata(data = GNIPDataDE)
dim(GNIPDataDE_agg)
## [1] 27  7
head(GNIPDataDE_agg)
##       stationID isoscape.value var.isoscape.value n.isoscape.value   lat  long elev
## 1        ARKONA      -60.99231           247.8767              134 54.67 13.43   42
## 2        ARTERN      -61.00653           510.5720              199 51.37 11.29  164
## 3 BAD SALZUFLEN      -53.80770           310.1046              408 52.10  8.75  135
## 4        BERLIN      -57.53477           395.6484              419 52.46 13.40   48
## 5  BRAUNSCHWEIG      -52.73350           339.4752              420 52.29 10.44   81
## 6      CUXHAVEN      -49.25271           221.6594              420 53.87  8.70    5

IsoriX using IsoriX

Europefit <- isofit(iso.data = GNIPDataDE_agg, mean.model.fix = list(elev = TRUE, lat.abs = TRUE))
## [1] Fitting the following residual dispersion model using spaMM  :
## [1] "var.isoscape.value ~ 1 + (1|stationID) + Matern(1|long + lat)"
## [1] (it may take a while...)
## [1] Fitting the following mean model using spaMM  :
## [1] "isoscape.value ~ 1 + elev + lat.abs + (1|stationID) + Matern(1|long + lat)"
## [1] (it may take a while...)
## [1] Done!
## [1] "Models were fitted in 10 sec."

The data for the elevation

library(rasterVis)
plot(ElevRasterDE)

IsoriX using IsoriX

isoscape <- isoscape(elevation.raster = ElevRasterDE, isofit = Europefit)
## [1] Building the isoscapes... 
## [1] (this may take a while)
  |                                                                                                
  |=======================================================================                   |  79%
  |                                                                                                
  |========================================================================                  |  80%
  |                                                                                                
  |=========================================================================                 |  81%
  |                                                                                                
  |==========================================================================                |  83%
  |                                                                                                
  |============================================================================              |  84%
  |                                                                                                
  |=============================================================================             |  85%
  |                                                                                                
  |==============================================================================            |  86%
  |                                                                                                
  |===============================================================================           |  88%
  |                                                                                                
  |================================================================================          |  89%
  |                                                                                                
  |=================================================================================         |  90%
  |                                                                                                
  |==================================================================================        |  91%
  |                                                                                                
  |===================================================================================       |  93%
  |                                                                                                
  |====================================================================================      |  94%
  |                                                                                                
  |======================================================================================    |  95%
  |                                                                                                
  |=======================================================================================   |  96%
  |                                                                                                
  |========================================================================================  |  98%
  |                                                                                                
  |========================================================================================= |  99%
  |                                                                                                
  |==========================================================================================| 100%
## [1] predictions for all 12240 locations have been computed in 6 sec.
plot.mean <- plot(x = isoscape, which = "mean", plot = FALSE)

plot.disp <- plot(x = isoscape, which = "disp", plot = FALSE)

Mean prediction of the distribution of Deuterium

plot.mean

Prediction of the residual variance in Deuterium

plot.disp

IsoriX using spaMM

## Step 1: fit the residual dispersion isoscape (a Gamma GLMM on the
## per-station sample variances, weighted by their degrees of freedom)
disp.fit <- fitme(var.isoscape.value ~ 1 + Matern(1 | long + lat), family = Gamma(log),
                  prior.weights = n.isoscape.value - 1, method = "REML", fixed = list(phi = 2),
                  control.dist = list(dist.method = "Earth"), data = GNIPDataDE_agg)
GNIPDataDE_agg$pred.disp <- predict(disp.fit)[, 1]

## Step 2: fit the mean isoscape, plugging the predicted dispersion in
## as an offset in the residual model
mean.fit <- fitme(isoscape.value ~ lat + elev + Matern(1 | long + lat), family = gaussian(), 
                  prior.weights = n.isoscape.value, method = "REML",
                  resid.model = list(formula = ~ 0 + offset(pred.disp), family = Gamma(identity)),
                  control.dist = list(dist.method = "Earth"), data = GNIPDataDE_agg)
Europefit2 <- list(mean.fit = mean.fit, disp.fit = disp.fit)
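Both fits rely on a `Matern(1 | long + lat)` term, i.e. a Gaussian random effect whose correlation between two locations decays with distance according to the Matérn correlation function. As a minimal base-R sketch (the function name `matern_corr` and the parameterisation with scale `rho` and smoothness `nu` are illustrative, not the spaMM internals):

```r
## Matérn correlation as a function of distance d
## rho: scale parameter, nu: smoothness parameter
matern_corr <- function(d, rho, nu) {
  x <- rho * d
  out <- (2^(1 - nu) / gamma(nu)) * x^nu * besselK(x, nu)
  out[d == 0] <- 1  # correlation is 1 at zero distance
  out
}

## nu = 0.5 reduces to the exponential correlation exp(-rho * d)
matern_corr(c(0, 0.5, 1, 2), rho = 1, nu = 0.5)
```

Larger `nu` yields smoother spatial surfaces; `rho` controls how quickly the correlation drops off with distance.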

Predictions

isoscape2 <- isoscape(elevation.raster = ElevRasterDE, isofit = Europefit2)
## [1] Building the isoscapes... 
## [1] (this may take a while)
## 
## [1] predictions for all 12240 locations have been computed in 5 sec.
plot(x = isoscape2, which = "mean", plot = TRUE)

DHGLM

system.time(
  dhglm <- corrHLfit(isoscape.value ~ lat + elev + Matern(1 | long + lat), family = gaussian(), 
              HLmethod = "REML", data = GNIPDataDE, control.dist = list(dist.method = "Earth"),
              resid.model = list(formula = ~ 1 + Matern(1 | long + lat),
                                 control.dist = list(dist.method = "Earth"),
                                 family = Gamma(log), fixed = list(phi = 2)),
              verbose = c(TRACE = TRUE))
)

What you need to remember

  • how to handle temporal and spatial autocorrelation
  • that GLMMs are not very difficult if you already know GLMs and LMMs
  • that random effects can also have non-Gaussian distributions
  • that there are even more general methods than GLMMs out there: HGLMs and DHGLMs
  • how to handle heteroscedasticity

Table of contents

Mixed-effects models